Author

Theresa Szczepanski

Published

October 22, 2023

Code
source('dependencies.R')
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
#install.packages("stargazer")
#library(stargazer)
#library(flexmix)

#library(qpcR)

Research Questions

The Massachusetts Education Reform Act of 1993 was passed in the context of a national movement toward education reform throughout the United States. As early as 1989, there were calls to establish national curriculum standards as a way to improve students' college and career readiness and close poverty gaps (Greer 2018). The Massachusetts Comprehensive Assessment System (MCAS) tests were introduced as part of the Massachusetts Education Reform Act.

The MCAS tests are a significant tool for educational equity. Scores on the Grade 10 Math MCAS test “predict longer-term educational attainments and labor market success, above and beyond typical markers of student advantage,” and differences in these outcomes among student groups are largely and “sometimes completely accounted for” by differences in 10th-grade MCAS scores and educational attainments (Papy 2020).

With the introduction of the new Common Core standards and accountability testing came the demand for aligned curricular materials and teaching practices. Research indicates that the choice of instructional materials can have an impact “as large as or larger than the impact of teacher quality” (Chingos 2012). Massachusetts, along with Arkansas, Delaware, Kentucky, Louisiana, Maryland, Mississippi, Nebraska, New Mexico, Ohio, Rhode Island, Tennessee, and Texas, belongs to the Council of Chief State School Officers’ (CCSSO) High Quality Instructional Materials and Professional Development network, which aims to close the “opportunity gap” among students by ensuring that every teacher has access to high-quality, standards-aligned instructional materials and receives relevant professional development to support their use of these materials (Chief State School Officers 2021).

All Massachusetts public school students must complete a high school science MCAS exam, providing a wealth of standardized data on students’ discipline-specific skill development. All schools receive annual summary reports on student performance. Significant work has been done using the MCAS achievement data and the Student Opportunity Act to identify achievement gaps and address funding inequities across the Commonwealth (Papy 2020). With the funding gaps outlined in the late 1990s closing, one could consider how the MCAS data could be leveraged to support the state’s current high-quality instructional materials initiatives. The state compiles each school’s performance disaggregated by MCAS question item (DESE 2022).

Using the curricular information provided in the statewide Next Generation MCAS High School Introductory Physics Item reports together with school-level student performance data, we hope to address the following broad questions:

  • Is there a relationship between differences in a school’s performance across Science Practice Categories and a school’s overall achievement on the Introductory Physics exam?

  • How can trends in a school’s performance be used to provide schools with guidance on discipline-specific curricular areas to target to improve student achievement?

In this report, I will analyze the High School Introductory Physics Next Generation Massachusetts Comprehensive Assessment System (MCAS) test results for Massachusetts public schools.

Data for the study were drawn from DESE’s Next Generation MCAS Test Achievement Results statewide report, Item Analysis statewide report, and the MCAS digital item library. The Next Generation High School Introductory Physics MCAS assessment consists of 42 multiple-choice and constructed-response items that assess students on Physical Science standards from the 2016 STE Massachusetts Curriculum Framework in the content Reporting Categories of Motion and Forces (MF), Energy (EN), and Waves (WA). Each item is associated with a specific content standard from the Massachusetts Curriculum Framework as well as an underlying science Practice Category of Evidence, Reasoning, and Modeling (ERM), Mathematics and Data (MD), or Investigations and Questioning (IQ). The State Item Report provides the percentage of points earned by students in a school for each item as well as the percentage of points earned by all students in the state for each item.

The HSPhy_NextGen_SchoolSum data frame contains summary performance results from 112 public schools across the Commonwealth on the Next Generation High School Introductory Physics MCAS, which was administered in the spring of 2022 and the spring of 2023. Of these schools, 87 tested students in both years and 25 tested students in only one of the two years; in total, 27,745 students completed the exam.

For each school, values are reported for 44 different variables, which fall into three broad categories:

  • School Characteristics: This includes the name of the school and the size of the school, School Size, as determined by the number of students who completed the MCAS exam.

  • Discipline-Specific Performance Metrics: This includes the percentage of points earned by students at a school for items in each content Reporting Category (MF%, EN%, WA%) and science Practice Category (ERM%, MD%, IQ%), the difference between a school’s percentage of points earned and the percentage of points earned by all students in the state (MFDiff, ENDiff, etc.), and the variability in a school’s performance relative to the state by category, as measured by the standard deviation of the school’s per-item Diff within each category (SD MF Diff, SD EN Diff, etc.).

  • Aggregate Performance Level Metrics: This includes a school’s percentage of students at each of the four Performance Levels (E%: Exceeding Expectations, M%: Meeting Expectations, PM%: Partially Meeting Expectations, and NM%: Not Meeting Expectations), the difference between these percentages and the percentage of students in Massachusetts at each performance level (EDiff, MDiff, PMDiff, NMDiff), and an ordinal classification of schools, EM Perf Stat, based on the percentage of students classified as Exceeding or Meeting expectations on the exam (HighEM, HighM, Mid, Mid-Low, Low).
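The derived Diff and SD Diff metrics above can be illustrated with a toy calculation. All item tags and values below are invented for illustration, and this unweighted sketch ignores per-item point values; in the real data, Diff comes from the school and state percentages in the Item Report.

```r
# Toy item-level results for one hypothetical school (values are invented).
# Each item has a Practice Category tag and a percent of points earned
# by the school and by all students in the state.
item_cat   <- c("MD", "MD", "MD", "ERM", "ERM")
school_pct <- c(40, 55, 48, 62, 49)
state_pct  <- c(50, 60, 55, 58, 52)

diff <- school_pct - state_pct                 # per-item Diff vs. the state

md_diff    <- mean(diff[item_cat == "MD"])     # unweighted analogue of MDDiff
md_diff_sd <- sd(diff[item_cat == "MD"])       # analogue of SD MD Diff
```

A large md_diff_sd would indicate that the school's performance relative to the state is inconsistent across the Mathematics and Data items.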

See the HSPhy_NextGenMCASDF data frame summary and codebook for further details about all variables.

Hypotheses

  • A school’s percentage of students classified as Exceeding or Meeting expectations on the Introductory Physics MCAS is negatively associated with a school’s variance in performance relative to students in the state on Mathematics and Data items, SD MD Diff.

  • A school’s summary performance on items in a given content Reporting Category as measured by MF%, EN%, and WA%, is positively associated with the Reporting Category's weight within the exam.

Descriptive Statistics

Code
#HSPhy_NextGen_SchoolSum
HSPhy_NextGen_SchoolSum<-HSPhy_NextGen_SchoolSum%>%
  ungroup()

#HSPhy_NextGen_SchoolSum
# HSPhy_NextGen_PerfDF
# HSPhy_NextGen_SchoolIT301DF

HSPhy_2023_SchoolSizeDF<-read_excel("data/2023_Physics_NextGenMCASItem.xlsx", skip = 1)%>%
  select(`School Name`, `School Code`, `Tested`)%>%
  mutate(`Tested` = as.integer(`Tested`))

HSPhy_2022_SchoolSizeDF<-read_excel("data/2022_Physics_NextGenMCASItem.xlsx", skip = 1)%>%
  select(`School Name`, `School Code`, `Tested`)%>%
  mutate(`Tested` = as.integer(`Tested`))


HSPhy_SchoolSize <- rbind(HSPhy_2023_SchoolSizeDF, HSPhy_2022_SchoolSizeDF)%>%
  mutate(count = 1)%>%
  group_by(`School Name`, `School Code`)%>%
  summarise(count = sum(count),
            `Tested` = sum(`Tested`))%>%
  mutate(`Tested Count` = round(`Tested`/count))%>%
  ungroup()
#HSPhy_SchoolSize
# Quartile cut points of the average number of students tested per school;
# named size_quantiles to avoid masking stats::quantile()
size_quantiles <- quantile(HSPhy_SchoolSize$`Tested Count`)
HSPhy_Size<-HSPhy_SchoolSize%>%
  mutate(`School Size` = case_when(
    # case_when() evaluates conditions in order, so lower bounds are implicit
    `Tested Count` <= size_quantiles[2] ~ 0, # Small
    `Tested Count` <= size_quantiles[3] ~ 1, # Low-Mid
    `Tested Count` <= size_quantiles[4] ~ 2, # Upper-Mid
    `Tested Count` <= size_quantiles[5] ~ 3  # Large
  ))%>%
  select(`School Name`, `School Code`, `School Size`)


#HSPhy_Size

HSPhy_NextGen_SchoolSum<-HSPhy_NextGen_SchoolSum%>%
  left_join(HSPhy_Size, by = c("School Name" = "School Name", "School Code" = "School Code"))%>%
  mutate(`EMDiff` = `EDiff` + `MDiff`)%>%
  mutate(`EM Perf Stat` = case_when(
    `EDiff` > 0 & `EMDiff` > 0 ~ "HighEM",
    `EDiff` <= 0 & `EMDiff` > 0 ~ "HighM",
    `EMDiff` <= 0 & `EMDiff` > -14 ~ "Mid",
    `EMDiff` <= -14 & `EMDiff` >= -33 ~ "Mid-Low",
    `EMDiff` < -33 ~ "Low"
  ))%>%
  mutate(`EM Perf Stat` = recode_factor(`EM Perf Stat`,
                                 "HighEM" = "HighEM",
                                 "HighM" = "HighM",
                                 "Mid" = "Mid",
                                 "Mid-Low" = "Mid-Low",
                                 "Low" = "Low",
                                 .ordered = TRUE))
HSPhy_NextGen_SchoolSum
Code
#quantile(HSPhy_NextGen_SchoolSum$`EMDiff`)
#summary(HSPhy_NextGen_SchoolSum)
print(summarytools::dfSummary(HSPhy_NextGen_SchoolSum,
                              varnumbers   = FALSE,
                              plain.ascii  = FALSE,
                              style        = "grid",
                              graph.magnif = 0.70,
                              valid.col    = FALSE),
      method = 'render',
      table.classes = 'table-condensed')

Data Frame Summary

HSPhy_NextGen_SchoolSum

Dimensions: 112 x 40
Duplicates: 0
| Variable | Type | Stats / Values (Freqs, % of Valid) | Missing |
|---|---|---|---|
| Subject | character | 1 value: PHY, 112 (100.0%) | 0 (0.0%) |
| School Name | character | 112 distinct values (each 1 school, 0.9%) | 0 (0.0%) |
| School Code | character | 112 distinct values (each 1 school, 0.9%) | 0 (0.0%) |
| EN% | numeric | mean (sd): 47.7 (14); 19 ≤ 47 ≤ 80; IQR (CV): 19.5 (0.3); 50 distinct | 0 (0.0%) |
| MF% | numeric | mean (sd): 52.2 (13.5); 22 ≤ 53 ≤ 79; IQR (CV): 18.2 (0.3); 48 distinct | 0 (0.0%) |
| WA% | numeric | mean (sd): 45 (13.1); 20 ≤ 44 ≤ 79; IQR (CV): 17.2 (0.3); 47 distinct | 0 (0.0%) |
| EN Diff SD | numeric | mean (sd): 8.5 (3.4); 2.5 ≤ 7.6 ≤ 20; IQR (CV): 4.2 (0.4); 108 distinct | 0 (0.0%) |
| MF Diff SD | numeric | mean (sd): 8.7 (3.1); 3.4 ≤ 8.3 ≤ 20.1; IQR (CV): 4.1 (0.4); 106 distinct | 0 (0.0%) |
| WA Diff SD | numeric | mean (sd): 8.3 (2.9); 2.9 ≤ 7.8 ≤ 15.2; IQR (CV): 3.6 (0.3); 106 distinct | 0 (0.0%) |
| IQ% | numeric | mean (sd): 49.4 (16.4); 3 ≤ 50 ≤ 86; IQR (CV): 24.2 (0.3); 55 distinct | 0 (0.0%) |
| MD% | numeric | mean (sd): 47.2 (14.4); 17 ≤ 47 ≤ 80; IQR (CV): 21.2 (0.3); 48 distinct | 0 (0.0%) |
| ERM% | numeric | mean (sd): 53 (12.6); 26 ≤ 52.5 ≤ 80; IQR (CV): 17.2 (0.2); 49 distinct | 0 (0.0%) |
| None% | numeric | mean (sd): 45.4 (13.1); 16 ≤ 45 ≤ 77; IQR (CV): 18 (0.3); 48 distinct | 0 (0.0%) |
| IQ Diff SD | numeric | mean (sd): 6.6 (4.4); 0.7 ≤ 5.5 ≤ 24.8; IQR (CV): 5.4 (0.7); 66 distinct | 10 (8.9%) |
| MD Diff SD | numeric | mean (sd): 8.6 (3.5); 3.8 ≤ 7.9 ≤ 23.5; IQR (CV): 4.1 (0.4); 107 distinct | 0 (0.0%) |
| ERM Diff SD | numeric | mean (sd): 8.6 (2.9); 3.4 ≤ 8.2 ≤ 17; IQR (CV): 4.2 (0.3); 101 distinct | 0 (0.0%) |
| None Diff SD | numeric | mean (sd): 8.7 (2.9); 3 ≤ 8.4 ≤ 16.4; IQR (CV): 3.9 (0.3); 107 distinct | 0 (0.0%) |
| Tested Students | integer | mean (sd): 239.5 (245.9); 10 ≤ 140.5 ≤ 1009; IQR (CV): 225.8 (1); 95 distinct | 0 (0.0%) |
| E% | numeric | mean (sd): 8.6 (12); 0 ≤ 4 ≤ 60; IQR (CV): 9.2 (1.4); 29 distinct | 0 (0.0%) |
| M% | numeric | mean (sd): 30.4 (17.7); 0 ≤ 32.5 ≤ 71; IQR (CV): 27.2 (0.6); 50 distinct | 0 (0.0%) |
| PM% | numeric | mean (sd): 43.6 (16.8); 0 ≤ 46.5 ≤ 93; IQR (CV): 20.5 (0.4); 53 distinct | 0 (0.0%) |
| NM% | numeric | mean (sd): 17.3 (19.1); 0 ≤ 10 ≤ 89; IQR (CV): 21.2 (1.1); 44 distinct | 0 (0.0%) |
| E%State | numeric | 1 distinct value: 14 (112, 100.0%) | 0 (0.0%) |
| M%State | numeric | 1 distinct value: 36 (112, 100.0%) | 0 (0.0%) |
| PM%State | numeric | 1 distinct value: 38 (112, 100.0%) | 0 (0.0%) |
| NM%State | numeric | 1 distinct value: 12 (112, 100.0%) | 0 (0.0%) |
| EDiff | numeric | mean (sd): -5.4 (12); -14 ≤ -10 ≤ 46; IQR (CV): 9.2 (-2.2); 29 distinct | 0 (0.0%) |
| MDiff | numeric | mean (sd): -5.6 (17.7); -36 ≤ -3.5 ≤ 35; IQR (CV): 27.2 (-3.2); 50 distinct | 0 (0.0%) |
| PMDiff | numeric | mean (sd): 5.6 (16.8); -38 ≤ 8.5 ≤ 55; IQR (CV): 20.5 (3); 53 distinct | 0 (0.0%) |
| NMDiff | numeric | mean (sd): 5.3 (19.1); -12 ≤ -2 ≤ 77; IQR (CV): 21.2 (3.6); 44 distinct | 0 (0.0%) |
| EN%State | numeric | 1 distinct value: 54 (112, 100.0%) | 0 (0.0%) |
| MF%State | numeric | 1 distinct value: 58 (112, 100.0%) | 0 (0.0%) |
| WA%State | numeric | 1 distinct value: 51 (112, 100.0%) | 0 (0.0%) |
| IQ%State | numeric | 1 distinct value: 55 (112, 100.0%) | 0 (0.0%) |
| MD%State | numeric | 1 distinct value: 54 (112, 100.0%) | 0 (0.0%) |
| ERM%State | numeric | 1 distinct value: 58 (112, 100.0%) | 0 (0.0%) |
| None%State | numeric | 1 distinct value: 51 (112, 100.0%) | 0 (0.0%) |
| School Size | numeric | mean (sd): 1.5 (1.1); 0 ≤ 1 ≤ 3; IQR (CV): 1.5 (0.8); 0: 28 (25.0%), 1: 29 (25.9%), 2: 27 (24.1%), 3: 28 (25.0%) | 0 (0.0%) |
| EMDiff | numeric | mean (sd): -10.9 (26.6); -50 ≤ -14 ≤ 50; IQR (CV): 39 (-2.4); 64 distinct | 0 (0.0%) |
| EM Perf Stat | ordered factor | HighEM: 22 (19.6%), HighM: 16 (14.3%), Mid: 18 (16.1%), Mid-Low: 32 (28.6%), Low: 24 (21.4%) | 0 (0.0%) |

Generated by summarytools 1.0.1 (R version 4.2.2)
2023-11-26

Key Variables

To explore the relationship between the distribution of a school’s students across Performance Levels and the school’s performance in content categories, we examine the percentage of points earned by students at each school as well as the standard deviation of the difference between points earned by students at a school and points earned by students in the state, across Reporting Categories and Practice Categories. We grouped schools by EM Perf Stat, an ordinal variable classifying schools by the percentage of their students classified as either Exceeding or Meeting expectations on the MCAS. These numbers suggest that items in the Science Practice Category of Mathematics and Data are more challenging for students than those in Evidence, Reasoning, and Modeling. Both practice categories are strongly and equally emphasized within the exam; items tagged with these categories account for 82% of the available points, with exactly 41% of available points coming from each category.

When considering content Reporting Categories, there do not seem to be discernible distinctions across EM Perf Stat groups in schools’ relative performance across categories. All schools seem to perform strongest on Motion and Forces items, followed by Energy, and weakest on Waves items. Notably, this is also the order of the relative weights of the content areas within the exam; MF, EN, and WA items account for 50%, 30%, and 20% of exam points, respectively.
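Given those weights, a school's overall percentage of points earned is approximately a weighted mean of its category percentages. A minimal sketch with invented values (this is an approximation that ignores rounding and per-item point totals):

```r
# Hypothetical category percentages for one school (illustrative values)
mf_pct <- 53
en_pct <- 47
wa_pct <- 44

# MF, EN, and WA carry roughly 50%, 30%, and 20% of exam points,
# so the overall percent of points earned is roughly the weighted mean
overall_pct <- 0.50 * mf_pct + 0.30 * en_pct + 0.20 * wa_pct
overall_pct  # 49.4 for these values
```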

Code
#quantile(HSPhy_NextGen_SchoolSum$`EMDiff`)

HSPhy_NextGen_SchoolSum%>%
  group_by(`EM Perf Stat`)%>%
    summarise(`Mean MD%` = mean(`MD%`),
              `Mean MD SD` = mean(`MD Diff SD`),
              `Mean ERM%` = mean(`ERM%`),
              `Mean ERM SD` = mean(`ERM Diff SD`))
Code
HSPhy_NextGen_SchoolSum%>%
  group_by(`EM Perf Stat`)%>%
    summarise(`Mean MF%` = mean(`MF%`),
              `Mean MF SD` = mean(`MF Diff SD`),
              `Mean EN%` = mean(`EN%`),
              `Mean EN SD` = mean(`EN Diff SD`),
              `Mean WA%` = mean(`WA%`),
              `Mean WA SD` = mean(`WA Diff SD`))

Visualization

Distribution of Performance Level %

When examining the statewide performance distribution, we can see from the right-skew that it is rare for schools to have high percentages of students classified as Not Meeting expectations and even rarer for schools to have high percentages of students classified as Exceeding expectations.

Code
HSPhy_NextGen_SchoolSum%>%
  select(`E%`, `M%`, `PM%`, `NM%`)%>%
  pivot_longer(c(1:4), names_to = "Performance Level", values_to = "% Students")%>%
   ggplot( aes(x=`% Students`, color=`Performance Level`, fill=`Performance Level`)) +
    geom_histogram(alpha=0.6, binwidth = 15) +
    scale_fill_viridis(discrete=TRUE) +
    scale_color_viridis(discrete=TRUE) +
    #theme_ipsum() +
    theme(
      legend.position="none",
      panel.spacing = unit(0.1, "lines"),
      strip.text.x = element_text(size = 8)
    ) +
  
    facet_wrap(~`Performance Level`)+
      labs( y = "",
            title = "School Performance Level Distribution",
            x = "% Students at Performance Level",
            caption = "NextGen HS Physics MCAS")

Distribution of School Performance and Variability by Practice Cat

Although Mathematics and Data and Evidence, Reasoning, and Modeling items have strong and equal weighting in the HS Introductory Physics exam, student performance distributions are noticeably different across these practice categories.

Code
HSPhy_NextGen_SchoolSum%>%
  select(`ERM%`, `MD%`)%>%
  pivot_longer(c(1:2), names_to = "Practice Cat", values_to = "% Points")%>%
   ggplot( aes(x=`% Points`, color=`Practice Cat`, fill=`Practice Cat`)) +
    geom_histogram(alpha=0.6, binwidth = 3) +
    scale_fill_viridis(discrete=TRUE) +
    scale_color_viridis(discrete=TRUE) +
    #theme_ipsum() +
    theme(
      panel.spacing = unit(0.1, "lines"),
      strip.text.x = element_text(size = 8)
    ) +
  
    facet_wrap(~`Practice Cat`)+
      labs( y = "",
            title = "School Performance by Practice Category",
            x = "% Points Earned",
            caption = "NextGen HS Physics MCAS")

Code
  #ggtitle("Practice Category Performance")

When considering the variability of a school’s performance on items relative to the state by Practice Category, SD MD Diff and SD ERM Diff, we can see that the Mathematics and Data distribution is more right-skewed.

Code
  HSPhy_NextGen_SchoolSum%>%
  select(`ERM Diff SD`, `MD Diff SD`)%>%
  pivot_longer(c(1:2), names_to = "Practice Cat", values_to = "SD Diff")%>%
   ggplot( aes(x=`SD Diff`, color=`Practice Cat`, fill=`Practice Cat`)) +
    geom_histogram(alpha=0.6, binwidth = 3) +
    scale_fill_viridis(discrete=TRUE) +
    scale_color_viridis(discrete=TRUE) +
   # theme_ipsum() +
    theme(
      panel.spacing = unit(0.1, "lines"),
      strip.text.x = element_text(size = 8)
    ) +
      labs( y = "",
            title = "School Performance Variation by Practice Category",
            x = "SD Diff",
            caption = "NextGen HS Physics MCAS") +
    facet_wrap(~`Practice Cat`)

Mathematics and Data vs. Evidence Reasoning and Modeling (Practice Category)

These plots seem to suggest that schools with the highest percentage of students classified as Exceeding expectations on the MCAS have the lowest variation in performance on Mathematics and Data items, while schools with the lowest percentage of such students have the highest variation.

Code
HSPhy_NextGen_SchoolSum%>%
  select(`EM Perf Stat`, `ERM Diff SD`, `MD Diff SD` )%>%
  pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "SD Diff")%>%
  ggplot( aes(x= `EM Perf Stat`, y=`SD Diff`, fill= `EM Perf Stat`)) +
    geom_boxplot() +
    scale_fill_viridis(discrete = TRUE, alpha=0.6) +
    geom_jitter(color="black", size=0.4, alpha=0.9) +
    
    theme(
      plot.title = element_text(size=11),
      axis.title.x=element_blank(),
      #axis.text.x=element_blank()
    ) +
 
    labs( y = "SD Diff",
            title = "Student Performance Variation by Practice Category",
            x = "",
            caption = "NextGen HS Physics MCAS") +
  facet_wrap(~`Practice Cat`)

Code
HSPhy_NextGen_SchoolSum%>%
  select(`EM Perf Stat`, `ERM Diff SD`, `MD Diff SD` )%>%
  pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "SD Diff")%>%
  ggplot( aes(x= `Practice Cat`, y=`SD Diff`, fill= `Practice Cat`)) +
    geom_boxplot() +
    scale_fill_viridis(discrete = TRUE, alpha=0.6) +
    geom_jitter(color="black", size=0.4, alpha=0.9) +
    #theme_ipsum() +
    theme(
     
      plot.title = element_text(size=11),
      axis.title.x=element_blank(),
      axis.text.x=element_blank()
    ) +
     labs( y = "SD Diff",
            title = "Student Practice Cat. Variation by Achievement Level",
            x = "",
            caption = "NextGen HS Physics MCAS") +
    #xlab("")+
  facet_wrap(~`EM Perf Stat`)

These plots seem to suggest that students at all schools have more difficulty with Mathematics and Data items than with Evidence, Reasoning, and Modeling items.

Code
HSPhy_NextGen_SchoolSum%>%
  select(`EM Perf Stat`, `ERM%`, `MD%` )%>%
  pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "%Points")%>%
  ggplot( aes(x= `EM Perf Stat`, y=`%Points`, fill= `EM Perf Stat`)) +
    geom_boxplot() +
    scale_fill_viridis(discrete = TRUE, alpha=0.6) +
    geom_jitter(color="black", size=0.4, alpha=0.9) +
    #theme_ipsum() +
    theme(
      
      plot.title = element_text(size=11)
    ) +
    labs( y = "%Points Earned",
            title = "Student Practice Cat. Achievement by Performance Level",
            x = "",
            caption = "NextGen HS Physics MCAS") +
    #xlab("")+
  facet_wrap(~`Practice Cat`)

Code
HSPhy_NextGen_SchoolSum%>%
  select(`EM Perf Stat`, `ERM%`, `MD%` )%>%
  pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "%Points")%>%
  ggplot( aes(x= `Practice Cat`, y=`%Points`, fill= `Practice Cat`)) +
    geom_boxplot() +
    scale_fill_viridis(discrete = TRUE, alpha=0.6) +
    geom_jitter(color="black", size=0.4, alpha=0.9) +
    #theme_ipsum() +
    theme(
      
      plot.title = element_text(size=11)
    ) +
    labs( y = "%Points Earned",
            title = "Student Practice Cat. Achievement by Performance Level",
            x = "",
            caption = "NextGen HS Physics MCAS") +
    #xlab("")+
  facet_wrap(~`EM Perf Stat`, scale ="free_y")

Distribution of School Performance and Variability by Reporting Cat

Here we visualize the variability of a school’s performance on items partitioned by the content Reporting Categories of Motion and Forces, Energy, and Waves via MF%/SD MF Diff, EN%/SD EN Diff, and WA%/SD WA Diff.

Code
  HSPhy_NextGen_SchoolSum%>%
  select(`EM Perf Stat`, `MF Diff SD`, `EN Diff SD`, `WA Diff SD` )%>%
  pivot_longer(c(2:4), names_to = "Report Cat", values_to = "SD Diff")%>%
  ggplot( aes(x=`SD Diff`, color=`Report Cat`, fill=`Report Cat`)) +
    geom_histogram(alpha=0.6, binwidth = 3) +
    scale_fill_viridis(discrete=TRUE) +
    scale_color_viridis(discrete=TRUE) +
    #theme_ipsum() +
    theme(
      panel.spacing = unit(0.1, "lines"),
      strip.text.x = element_text(size = 8)
    ) +
      labs( y = "",
            title = "School Performance Variation by Content Reporting Category",
            x = "SD Diff",
            caption = "NextGen HS Physics MCAS") +
  facet_wrap(~`Report Cat`)

Code
HSPhy_NextGen_SchoolSum%>%
  select(`EM Perf Stat`, `MF%`, `EN%`, `WA%` )%>%
  pivot_longer(c(2:4), names_to = "Report Cat", values_to = "% Points")%>%
 ggplot( aes(x=`% Points`, color=`Report Cat`, fill=`Report Cat`)) +
    geom_histogram(alpha=0.6, binwidth = 3) +
    scale_fill_viridis(discrete=TRUE) +
    scale_color_viridis(discrete=TRUE) +
    #theme_ipsum() +
    theme(
      panel.spacing = unit(0.1, "lines"),
      strip.text.x = element_text(size = 8)
    ) +
  
    facet_wrap(~`Report Cat`)+
      labs( y = "",
            title = "Student Performance by Content Reporting Category",
            x = "% Points Earned",
            caption = "NextGen HS Physics MCAS")

Code
  #ggtitle("Practice Category Performance")

Motion and Forces vs. Energy vs. Waves (Reporting Category)

These plots suggest that most schools exhibit similar levels of variability in performance relative to the state across all reporting categories. Schools with the lowest percentage of students Exceeding expectations exhibit high variability across all content reporting categories, but seem to have somewhat lower variability on Waves items.

Code
HSPhy_NextGen_SchoolSum%>%
  select(`EM Perf Stat`, `MF Diff SD`, `EN Diff SD`, `WA Diff SD` )%>%
  pivot_longer(c(2:4), names_to = "Report Cat", values_to = "SD Diff")%>%
  ggplot( aes(x= `EM Perf Stat`, y=`SD Diff`, fill= `EM Perf Stat`)) +
    geom_boxplot() +
    scale_fill_viridis(discrete = TRUE, alpha=0.6) +
    geom_jitter(color="black", size=0.4, alpha=0.9) +
    
    theme(
      plot.title = element_text(size=11),
      axis.title.x=element_blank(),
      axis.text.x=element_blank()
    ) +
 
    labs( y = "SD Diff",
            title = "School Performance Variation by Content Reporting Category",
            x = "",
            caption = "NextGen HS Physics MCAS") +
  facet_wrap(~`Report Cat`)

Code
HSPhy_NextGen_SchoolSum%>%
  select(`EM Perf Stat`, `MF Diff SD`, `EN Diff SD`, `WA Diff SD` )%>%
  pivot_longer(c(2:4), names_to = "Report Cat", values_to = "SD Diff")%>%
  ggplot( aes(x= `Report Cat`, y=`SD Diff`, fill= `Report Cat`)) +
    geom_boxplot() +
    scale_fill_viridis(discrete = TRUE, alpha=0.6) +
    geom_jitter(color="black", size=0.4, alpha=0.9) +
    #theme_ipsum() +
    theme(
     
      plot.title = element_text(size=11),
      axis.title.x=element_blank(),
      axis.text.x=element_blank()
    ) +
     labs( y = "SD Diff",
            title = "School Content Reporting Cat. Variation by Achievement Level",
            x = "",
            caption = "NextGen HS Physics MCAS") +
    #xlab("")+
  facet_wrap(~`EM Perf Stat`)

Code
HSPhy_NextGen_SchoolSum%>%
  select(`EM Perf Stat`, `MF%`, `EN%`, `WA%` )%>%
  pivot_longer(c(2:4), names_to = "Report Cat", values_to = "% Points")%>%
  ggplot( aes(x= `Report Cat`, y=`% Points`, fill= `Report Cat`)) +
    geom_boxplot() +
    scale_fill_viridis(discrete = TRUE, alpha=0.6) +
    geom_jitter(color="black", size=0.4, alpha=0.9) +
    #theme_ipsum() +
    theme(

      plot.title = element_text(size=11),
      axis.title.x=element_blank(),
      axis.text.x=element_blank()
    ) +
     labs( y = "% Points Earned",
            title = "School Content Reporting Cat. Performance by Achievement Level",
            x = "",
            caption = "NextGen HS Physics MCAS") +
    #xlab("")+
  facet_wrap(~`EM Perf Stat`)

Code
HSPhy_NextGen_SchoolSum<-HSPhy_NextGen_SchoolSum%>%
  ungroup()

HSPhy_NextGen_SchoolSum<-HSPhy_NextGen_SchoolSum%>%
  mutate(`EorM%` = `E%` + `M%`)

#HSPhy_NextGen_SchoolSum

Visualization of School Size, EorM%, MD Diff, ERM Diff

To explore the relationship between the variability of a school’s Diff relative to the state on Mathematics and Data items, MD Diff SD, and a school’s percentage of students Meeting or Exceeding expectations on the MCAS, EorM%, we also consider School Size (0: smallest, 3: largest) as a control. Our visuals suggest that smaller schools show higher variation on Mathematics and Data items and typically perform worse, both on Mathematics and Data and on the MCAS overall, than larger schools.

Code
HSPhy_NextGen_SchoolSum%>%
  group_by(`School Size`)%>%
  summarize(
    `Mean EorM%` = mean(`EorM%`),
      `Mean MD%` = mean(`MD%`),
     `Mean MD Diff SD` = mean(`MD Diff SD`),
     `Mean ERM%` = mean(`ERM%`),
     `Mean ERM Diff SD` = mean(`ERM Diff SD`)
             )
Code
HSPhy_NextGen_SchoolSum%>%
  select(`School Size`, `MD Diff SD`, `ERM Diff SD` )%>%
  pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "SD Diff")%>%
  ggplot( aes(x= `Practice Cat`, y=`SD Diff`, fill= `Practice Cat`)) +
    geom_boxplot() +
    scale_fill_viridis(discrete = TRUE, alpha=0.6) +
    geom_jitter(color="black", size=0.4, alpha=0.9) +
    #theme_ipsum() +
    theme(

      plot.title = element_text(size=11),
      axis.title.x=element_blank(),
      axis.text.x=element_blank()
    ) +
     labs( y = "SD Diff",
            title = "Student Practice Cat. Variation by School Size",
            x = "",
            caption = "NextGen HS Physics MCAS") +
    #xlab("")+
  facet_wrap(~`School Size`, scale = "free")

Code
HSPhy_NextGen_SchoolSum%>%
  select(`School Size`, `MD%`, `ERM%` )%>%
  pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "%Points")%>%
  ggplot( aes(x= `Practice Cat`, y=`%Points`, fill= `Practice Cat`)) +
    geom_boxplot() +
    scale_fill_viridis(discrete = TRUE, alpha=0.6) +
    geom_jitter(color="black", size=0.4, alpha=0.9) +
    #theme_ipsum() +
    theme(

      plot.title = element_text(size=11),
      axis.title.x=element_blank(),
      axis.text.x=element_blank()
    ) +
     labs( y = "% Points Earned",
            title = "Student Practice Cat. Achievement by School Size",
            x = "",
            caption = "NextGen HS Physics MCAS") +
    #xlab("")+
  facet_wrap(~`School Size`)

However, when we group the schools by EM Perf Stat, we find that the highest-performing (HighEM) small schools have a higher percentage of students Meeting or Exceeding expectations.

Across all sizes, it seems that the weakest-performing schools show more variation in Mathematics and Data, and the strongest-performing schools show less variability in Mathematics and Data than in Evidence, Reasoning, and Modeling.

Code
# Faceted by performance level
  HSPhy_NextGen_SchoolSum%>%
  group_by(`School Size`, `EM Perf Stat`)%>%
  summarize(`Mean EorM%` = mean(`EorM%`),
      `Mean MD%` = mean(`MD%`),
     `Mean MD Diff SD` = mean(`MD Diff SD`),
     `Mean ERM%` = mean(`ERM%`),
     `Mean ERM Diff SD` = mean(`ERM Diff SD`)
             )
Code
# HSPhy_NextGen_SchoolSum%>%
#   select(`School Size`, `EorM%`, `MD Diff SD`, `EM Perf Stat` )%>%
#   #pivot_longer(c(2:3), names_to = "Practice Cat", values_to = "SD Diff")%>%
#   ggplot( aes(x= `School Size`, y=`EorM%`, fill= `School Size`)) +
#     geom_boxplot() +
#     scale_fill_viridis(discrete = TRUE, alpha=0.6) +
#     geom_jitter(color="black", size=0.4, alpha=0.9) +
#     #theme_ipsum() +
#     theme(
#      
#       plot.title = element_text(size=11),
#       axis.title.x=element_blank(),
#       axis.text.x=element_blank()
#     ) +
#      labs( y = "SD Diff",
#             title = "Students Meeting or Exceeding Expectations . Variation by Achievement Level",
#             x = "",
#             caption = "NextGen HS Physics MCAS") +
#     #xlab("")+
#   facet_wrap(~`EM Perf Stat`, scales = "free")

Hypothesis 1: Variation in Mathematics

  • A school’s percentage of students classified as Exceeding or Meeting expectations on the Introductory Physics MCAS is negatively associated with a school’s variance in performance relative to students in the state on Mathematics and Data items, MD Diff SD.

I fit several models with EorM% as the response variable and MD Diff SD as the key explanatory variable.

Then, I consider controlling for the following:

Control for the variability in the other largest Science Practice Category, Evidence, Reasoning, and Modeling

  • Explanatory: ERM Diff SD

Control for School Size

  • Explanatory: School Size

Control for the variability in the other content Reporting Categories: Waves, Motion and Forces, and Energy

  • Explanatory: WA Diff SD
  • Explanatory: MF Diff SD
  • Explanatory: EN Diff SD

Fit 1: Explanatory: MD Diff SD

When we consider MD Diff SD without controlling for other variables, we have

  • \(p-value: 2.726e-09\)
  • adjusted \(r^2 = .27\)
  • AIC = 1021.3
  • BIC = 1029.5
Code
fit_md = lm(`EorM%` ~ (`MD Diff SD`), data = HSPhy_NextGen_SchoolSum)
summary(fit_md)

Call:
lm(formula = `EorM%` ~ (`MD Diff SD`), data = HSPhy_NextGen_SchoolSum)

Residuals:
    Min      1Q  Median      3Q     Max 
-32.575 -18.347  -4.778  15.525  81.212 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)   73.1813     5.6876  12.867  < 2e-16 ***
`MD Diff SD`  -3.9674     0.6127  -6.476 2.73e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.71 on 110 degrees of freedom
Multiple R-squared:  0.276, Adjusted R-squared:  0.2694 
F-statistic: 41.94 on 1 and 110 DF,  p-value: 2.726e-09
Code
AIC(fit_md)
[1] 1021.306
Code
BIC(fit_md)
[1] 1029.462

Fit 2: Explanatory: MD Diff SD + ERM Diff SD

When we control for the variation in the other Science Practice Category, Evidence, Reasoning, and Modeling, ERM Diff SD is not significant and does not have a significant interaction with MD Diff SD.

Code
fit_md_erm = lm(`EorM%` ~ (`MD Diff SD` + `ERM Diff SD`), data = HSPhy_NextGen_SchoolSum)
summary(fit_md_erm)

Call:
lm(formula = `EorM%` ~ (`MD Diff SD` + `ERM Diff SD`), data = HSPhy_NextGen_SchoolSum)

Residuals:
    Min      1Q  Median      3Q     Max 
-32.499 -18.077  -4.584  16.254  81.167 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)    74.0590     6.7612  10.953  < 2e-16 ***
`MD Diff SD`   -3.7413     1.1166  -3.351  0.00111 ** 
`ERM Diff SD`  -0.3279     1.3517  -0.243  0.80876    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.81 on 109 degrees of freedom
Multiple R-squared:  0.2764,    Adjusted R-squared:  0.2631 
F-statistic: 20.82 on 2 and 109 DF,  p-value: 2.201e-08
Code
fit_md_erm_interact = lm(`EorM%` ~ (`MD Diff SD` + `ERM Diff SD` + `MD Diff SD` * `ERM Diff SD`), data = HSPhy_NextGen_SchoolSum)
summary(fit_md_erm_interact)

Call:
lm(formula = `EorM%` ~ (`MD Diff SD` + `ERM Diff SD` + `MD Diff SD` * 
    `ERM Diff SD`), data = HSPhy_NextGen_SchoolSum)

Residuals:
    Min      1Q  Median      3Q     Max 
-33.324 -17.081  -4.182  13.659  82.455 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                 95.5577    16.4478   5.810  6.4e-08 ***
`MD Diff SD`                -6.9179     2.4804  -2.789  0.00625 ** 
`ERM Diff SD`               -2.1027     1.8288  -1.150  0.25279    
`MD Diff SD`:`ERM Diff SD`   0.2558     0.1785   1.432  0.15491    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.7 on 108 degrees of freedom
Multiple R-squared:  0.2899,    Adjusted R-squared:  0.2702 
F-statistic:  14.7 on 3 and 108 DF,  p-value: 4.302e-08

Fit 3: Explanatory: MD Diff SD + WA Diff SD

When we control for the variation in the content Reporting Category Waves, WA Diff SD, without considering interaction, WA Diff SD is not statistically significant. However, when we include the interaction between WA Diff SD and MD Diff SD, MD Diff SD is still statistically significant and there is a statistically significant, \(p = .039\), interaction effect between WA Diff SD and MD Diff SD. For our model we have:

  • \(p-value: 3.762e-09\)
  • adjusted \(r^2 = .30\)
  • AIC = 1018
  • BIC = 1032
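The logic behind reading an interaction term this way can be made concrete: in a model \(y = b_0 + b_1 x + b_2 w + b_3 x w\), the slope of \(x\) is \(b_1 + b_3 w\), so the effect of one predictor must be read jointly with the other. A minimal sketch on simulated data (not the MCAS data), with x standing in for MD Diff SD and w for WA Diff SD:

```r
# Toy illustration: with an x:w interaction, the marginal slope of x
# depends on w. Simulated data only -- not the MCAS data.
set.seed(2)
w <- runif(100, 0, 10)
x <- rnorm(100, mean = 8, sd = 2)
y <- 100 - 10 * x + 0.5 * x * w + rnorm(100)
fit <- lm(y ~ x * w)
b <- coef(fit)

# Estimated slope of x at a given value of w: b1 + b3 * w
slope_at <- function(w0) unname(b["x"] + b["x:w"] * w0)
c(at_w2 = slope_at(2), at_w9 = slope_at(9))
```

With a positive interaction coefficient, the (negative) slope of x is less steep at larger values of w, which is the kind of moderation the interaction fits below are testing for.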
Code
fit_md_wa = lm(`EorM%` ~ (`MD Diff SD` + `WA Diff SD`), data = HSPhy_NextGen_SchoolSum)
summary(fit_md_wa)

Call:
lm(formula = `EorM%` ~ (`MD Diff SD` + `WA Diff SD`), data = HSPhy_NextGen_SchoolSum)

Residuals:
    Min      1Q  Median      3Q     Max 
-31.508 -17.922  -6.025  14.653  74.863 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)   67.3043     6.6291  10.153  < 2e-16 ***
`MD Diff SD`  -5.1507     0.9278  -5.551 2.02e-07 ***
`WA Diff SD`   1.9300     1.1438   1.687   0.0944 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.52 on 109 degrees of freedom
Multiple R-squared:  0.2944,    Adjusted R-squared:  0.2815 
F-statistic: 22.74 on 2 and 109 DF,  p-value: 5.561e-09
Code
fit_md_wa_interact= lm(`EorM%` ~ (`MD Diff SD` + `WA Diff SD` + `MD Diff SD`*`WA Diff SD`), data = HSPhy_NextGen_SchoolSum)
summary(fit_md_wa_interact)

Call:
lm(formula = `EorM%` ~ (`MD Diff SD` + `WA Diff SD` + `MD Diff SD` * 
    `WA Diff SD`), data = HSPhy_NextGen_SchoolSum)

Residuals:
    Min      1Q  Median      3Q     Max 
-33.160 -16.523  -5.964  14.101  69.660 

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)    
(Intercept)               104.1191    18.7929   5.540 2.15e-07 ***
`MD Diff SD`              -10.0194     2.5033  -4.002 0.000115 ***
`WA Diff SD`               -1.7848     2.1050  -0.848 0.398373    
`MD Diff SD`:`WA Diff SD`   0.4548     0.2177   2.089 0.039048 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.18 on 108 degrees of freedom
Multiple R-squared:  0.3218,    Adjusted R-squared:  0.303 
F-statistic: 17.09 on 3 and 108 DF,  p-value: 3.762e-09
Code
AIC(fit_md_wa_interact)
[1] 1017.981
Code
BIC(fit_md_wa_interact)
[1] 1031.574

When we consider adding a second content Reporting Category, Motion and Forces, only MD Diff SD remains statistically significant. So we will control for content with just one Reporting Category, Waves.

Code
fit_md_wa_mf = lm(`EorM%` ~ (`MD Diff SD` + `WA Diff SD` + `MD Diff SD`*`WA Diff SD` + `MF Diff SD` + `MD Diff SD`*`MF Diff SD`), data = HSPhy_NextGen_SchoolSum)
summary(fit_md_wa_mf)

Call:
lm(formula = `EorM%` ~ (`MD Diff SD` + `WA Diff SD` + `MD Diff SD` * 
    `WA Diff SD` + `MF Diff SD` + `MD Diff SD` * `MF Diff SD`), 
    data = HSPhy_NextGen_SchoolSum)

Residuals:
    Min      1Q  Median      3Q     Max 
-31.593 -16.746  -6.218  12.860  73.201 

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)    
(Intercept)               100.9689    19.1385   5.276 7.05e-07 ***
`MD Diff SD`               -8.3768     3.4313  -2.441   0.0163 *  
`WA Diff SD`                2.0642     3.1942   0.646   0.5195    
`MF Diff SD`               -4.2451     2.5290  -1.679   0.0962 .  
`MD Diff SD`:`WA Diff SD`   0.1080     0.3061   0.353   0.7249    
`MD Diff SD`:`MF Diff SD`   0.2522     0.1964   1.284   0.2019    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.09 on 106 degrees of freedom
Multiple R-squared:  0.3397,    Adjusted R-squared:  0.3086 
F-statistic: 10.91 on 5 and 106 DF,  p-value: 1.754e-08

Fit 4: Explanatory: MD Diff SD + WA Diff SD + School Size

When we add the explanatory variable School Size to our model, WA Diff SD no longer has a statistically significant interaction with MD Diff SD.

Code
fit_md_wa_size = lm(`EorM%` ~ (`MD Diff SD` + `WA Diff SD` +  `MD Diff SD`*`WA Diff SD` + `School Size`), data = HSPhy_NextGen_SchoolSum)
summary(fit_md_wa_size)

Call:
lm(formula = `EorM%` ~ (`MD Diff SD` + `WA Diff SD` + `MD Diff SD` * 
    `WA Diff SD` + `School Size`), data = HSPhy_NextGen_SchoolSum)

Residuals:
    Min      1Q  Median      3Q     Max 
-34.140 -17.120  -6.408  14.187  70.851 

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)    
(Intercept)                96.2026    26.0214   3.697 0.000346 ***
`MD Diff SD`               -9.5141     2.7609  -3.446 0.000814 ***
`WA Diff SD`               -1.2777     2.4047  -0.531 0.596279    
`School Size`               1.3307     3.0130   0.442 0.659627    
`MD Diff SD`:`WA Diff SD`   0.4215     0.2311   1.824 0.070985 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.26 on 107 degrees of freedom
Multiple R-squared:  0.3231,    Adjusted R-squared:  0.2978 
F-statistic: 12.77 on 4 and 107 DF,  p-value: 1.57e-08
Code
fit_md_size_interact = lm(`EorM%` ~ (`MD Diff SD`  + `School Size` + `MD Diff SD`* `School Size`), data = HSPhy_NextGen_SchoolSum)
summary(fit_md_size_interact)

Call:
lm(formula = `EorM%` ~ (`MD Diff SD` + `School Size` + `MD Diff SD` * 
    `School Size`), data = HSPhy_NextGen_SchoolSum)

Residuals:
    Min      1Q  Median      3Q     Max 
-35.782 -14.364  -6.129  13.415  76.444 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                 62.8319    10.5933   5.931 3.67e-08 ***
`MD Diff SD`                -2.8648     0.8741  -3.277  0.00141 ** 
`School Size`               14.8102     5.2528   2.819  0.00572 ** 
`MD Diff SD`:`School Size`  -2.0972     0.6858  -3.058  0.00281 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.98 on 108 degrees of freedom
Multiple R-squared:  0.3343,    Adjusted R-squared:  0.3158 
F-statistic: 18.07 on 3 and 108 DF,  p-value: 1.413e-09
Code
fit_md_size_interact = lm(`EorM%` ~ (`MD Diff SD` + `School Size` + `WA Diff SD` + `WA Diff SD`*`MD Diff SD` +  `MD Diff SD` * `School Size`), data = HSPhy_NextGen_SchoolSum)
summary(fit_md_size_interact)

Call:
lm(formula = `EorM%` ~ (`MD Diff SD` + `School Size` + `WA Diff SD` + 
    `WA Diff SD` * `MD Diff SD` + `MD Diff SD` * `School Size`), 
    data = HSPhy_NextGen_SchoolSum)

Residuals:
   Min     1Q Median     3Q    Max 
-37.86 -15.10  -5.69  11.87  71.30 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)   
(Intercept)                  2.7847    40.9048   0.068  0.94585   
`MD Diff SD`                 0.8114     4.4535   0.182  0.85578   
`School Size`               26.1137     9.0375   2.889  0.00468 **
`WA Diff SD`                 6.3866     3.5226   1.813  0.07265 . 
`MD Diff SD`:`WA Diff SD`   -0.3927     0.3591  -1.093  0.27670   
`MD Diff SD`:`School Size`  -3.1351     1.0822  -2.897  0.00458 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.53 on 106 degrees of freedom
Multiple R-squared:  0.3727,    Adjusted R-squared:  0.3432 
F-statistic:  12.6 on 5 and 106 DF,  p-value: 1.318e-09

Fit 5: Explanatory: MD Diff SD + School Size

When we control for School Size and remove content Reporting Categories from our model, we have that MD Diff SD: \(p = .001\) and School Size: \(p = .006\) are both statistically significant and there is a statistically significant, \(p = .003\), interaction effect between MD Diff SD and School Size. For our model we have:

  • \(p-value: 1.41e-09\)
  • adjusted \(r^2 = .32\)
  • AIC = 1016
  • BIC = 1029
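This kind of AIC/BIC/adjusted \(r^2\) comparison can be tabulated for any set of candidate lm() fits. A minimal sketch on R’s built-in mtcars data (not the MCAS data), mirroring the Fit 1–5 workflow:

```r
# Toy comparison of nested candidate models; mtcars stands in for
# HSPhy_NextGen_SchoolSum, and wt/am for MD Diff SD / School Size.
fits <- list(
  base     = lm(mpg ~ wt, data = mtcars),
  additive = lm(mpg ~ wt + am, data = mtcars),
  interact = lm(mpg ~ wt * am, data = mtcars)
)
comparison <- data.frame(
  model  = names(fits),
  AIC    = sapply(fits, AIC),
  BIC    = sapply(fits, BIC),
  adj_r2 = sapply(fits, function(f) summary(f)$adj.r.squared)
)
# Sort so the lowest-AIC candidate comes first
comparison[order(comparison$AIC), ]
```

Ranking by AIC and checking that adjusted \(r^2\) moves in the same direction is the same selection logic applied to the five fits above.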

Since this model has the lowest AIC, the 2nd lowest BIC, and the highest adjusted \(r^2\), all terms are statistically significant, and the model has a \(p-value: 1.41e-09\), we can conclude:

  • A school’s percentage of students classified as Exceeding or Meeting expectations on the Introductory Physics MCAS is negatively associated with a school’s variance in performance relative to students in the state on Mathematics and Data items, MD Diff SD, even when we control for the impact of School Size on a school’s percentage of students Exceeding or Meeting expectations.
Code
fit_md_size_interact = lm(`EorM%` ~ (`MD Diff SD`  + `School Size` + `MD Diff SD`* `School Size`), data = HSPhy_NextGen_SchoolSum)
summary(fit_md_size_interact)

Call:
lm(formula = `EorM%` ~ (`MD Diff SD` + `School Size` + `MD Diff SD` * 
    `School Size`), data = HSPhy_NextGen_SchoolSum)

Residuals:
    Min      1Q  Median      3Q     Max 
-35.782 -14.364  -6.129  13.415  76.444 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                 62.8319    10.5933   5.931 3.67e-08 ***
`MD Diff SD`                -2.8648     0.8741  -3.277  0.00141 ** 
`School Size`               14.8102     5.2528   2.819  0.00572 ** 
`MD Diff SD`:`School Size`  -2.0972     0.6858  -3.058  0.00281 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.98 on 108 degrees of freedom
Multiple R-squared:  0.3343,    Adjusted R-squared:  0.3158 
F-statistic: 18.07 on 3 and 108 DF,  p-value: 1.413e-09
Code
AIC(fit_md_size_interact)
[1] 1015.913
Code
BIC(fit_md_size_interact)
[1] 1029.505

Diagnostic of Final Model, Fit 5: fit_md_size_interact

When we consider our diagnostics, our model seems to satisfy the assumptions of:

  • Linearity

  • Normality of Errors

  • Homoskedasticity: there is a slight upward trend in our Scale-Location plot, but it is not very steep.

Our model does have a few observations that violate the Cook’s distance rule of thumb. We have 112 observations, so the threshold is \(\frac{4}{n} \approx .04\), and the 4th, 47th, and 104th observations exceed it.
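The 4/n screen used here can be sketched in a few lines; this is a toy example on simulated data (not the MCAS data), with one deliberately planted influential point:

```r
# Toy illustration of the 4/n Cook's distance rule of thumb:
# flag observations whose Cook's distance exceeds 4/n.
set.seed(1)
x <- rnorm(50)
y <- 2 - 3 * x + rnorm(50)
y[50] <- y[50] + 20          # plant an influential observation
fit <- lm(y ~ x)

threshold <- 4 / length(y)   # 4/n rule of thumb
flagged <- which(cooks.distance(fit) > threshold)
flagged
```

The flagged indices play the same role as the 4th, 47th, and 104th observations identified in the diagnostic plots above, and refitting without them checks whether the conclusions are driven by a handful of schools.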

When we remove those observations, and fit a new model, we find all of our explanatory variables to continue to be statistically significant with MD Diff SD: \(p =.0002\), School Size: \(p = 0.001028\), and there is still a statistically significant, \(p = 0.001462\), interaction effect between MD Diff SD and School Size.

For our adjusted model we have:

  • \(p-value: 1.027e-12\)
  • adjusted \(r^2 = .41\)
  • AIC = 964
  • BIC = 978

When we remove the outliers, the direction of our model and significance of our explanatory variables does not change.

Code
HSPhy_NextGen_SchoolSum
Code
plot(fit_md_size_interact, which = 1:6)

Code
cooks = 4/112
cooks
[1] 0.03571429
Code
 test<-HSPhy_NextGen_SchoolSum%>%
   filter(`School Code` != "00160505")%>%
    filter(`School Code` != "01520505")%>%
    filter(`School Code` != "08010605")


 fit_md_size_interact_adj = lm(`EorM%` ~ (`MD Diff SD`  + `School Size` + `MD Diff SD`* `School Size`), data = test)
summary(fit_md_size_interact_adj)

Call:
lm(formula = `EorM%` ~ (`MD Diff SD` + `School Size` + `MD Diff SD` * 
    `School Size`), data = test)

Residuals:
    Min      1Q  Median      3Q     Max 
-37.525 -13.400  -4.983  12.050  53.764 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                 60.2323     9.5275   6.322 6.39e-09 ***
`MD Diff SD`                -2.9847     0.7859  -3.798 0.000245 ***
`School Size`               16.2455     4.8103   3.377 0.001028 ** 
`MD Diff SD`:`School Size`  -2.0552     0.6288  -3.268 0.001462 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 19.64 on 105 degrees of freedom
Multiple R-squared:  0.4277,    Adjusted R-squared:  0.4114 
F-statistic: 26.16 on 3 and 105 DF,  p-value: 1.027e-12
Code
AIC(fit_md_size_interact_adj)
[1] 964.341
Code
BIC(fit_md_size_interact_adj)
[1] 977.7977

Visuals: MD Diff SD + School Size vs. EorM%

This image displays the association between MD Diff SD and EorM%. One can see that high variability on Mathematics and Data items is associated with lower achievement for schools. Larger schools tend to perform better on the exam than smaller schools and have less variation in their mathematics performance.

Code
ggplot(data = HSPhy_NextGen_SchoolSum, aes(x = `MD Diff SD`, y = `EorM%`, color = `School Size`)) +
  geom_point() +
  geom_smooth(method="lm", se=T)+
   labs( y = "EorM%",
            title = "Variance in Mathematics vs. Student Achievement",
            x = "MD Diff SD",
            caption = "NextGen HS Physics MCAS")

Code
# ggplot(data = HSPhy_NextGen_SchoolSum, aes(x = `MD Diff SD`, y = `EorM%`, color = `EM Perf Stat`)) +
#   geom_point() +
#   geom_smooth(method="lm", se=T)

Hypothesis 2: Reporting Category and School Performance

  • A school’s summary performance on items in a given content Reporting Category as measured by MF%, EN%, and WA%, is positively associated with the Reporting Category's weight within the exam.
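The design behind the fits below puts the data in long format: each school contributes one row per Reporting Category, and the category’s percent of points is regressed on the category’s exam weight. A minimal sketch on simulated data (not the MCAS data), using the same 50/30/20 weight split assigned later:

```r
# Toy version of the long-format design used for the weight fits:
# one row per (school, reporting category), with category % regressed
# on the category's exam weight. Simulated data only.
set.seed(3)
n_schools <- 40
weights <- c(MF = 50, EN = 30, WA = 20)
long <- data.frame(
  school   = rep(seq_len(n_schools), each = 3),
  category = rep(names(weights), times = n_schools),
  weight   = rep(unname(weights), times = n_schools)
)
# Simulate a positive association between weight and % of points earned
long$pct <- 40 + 0.25 * long$weight + rnorm(nrow(long), sd = 5)

fit <- lm(pct ~ weight, data = long)
coef(summary(fit))["weight", ]
```

A positive, significant weight coefficient in this setup is exactly the pattern the hypothesis predicts for Report_Cat_Weight.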

Fit 1: Report_Cat_Weight

When we regress a school’s Reporting Category performance, Report Cat%, on the category’s exam weight alone, Report_Cat_Weight is statistically significant, but the model explains very little of the variation. For our model we have:

  • \(p-value: 6.537e-05\)
  • adjusted \(r^2 = 0.04382\)
  • AIC = 2706
  • BIC = 2718
Code
HSPhy_NextGen_SchoolSum<-HSPhy_NextGen_SchoolSum%>%
  mutate(`MF_Weight` = round(100*30/60))%>%
  mutate(`EN_Weight` = round(100*18/60))%>%
  mutate(`WA_Weight` = round(100*12/60))

#HSPhy_NextGen_SchoolSum
  

 HSPhy_Weight<-HSPhy_NextGen_SchoolSum%>%
  select(`MF_Weight`, `EN_Weight`, `WA_Weight`, `MD Diff SD`, `School Size`, `School Name`, `School Code`)%>%
   pivot_longer(c(1:3), names_to = "Report Cat", values_to = "Exam Weight")

 HSPhy_Weight
Code
 HSPhy_Cat<-HSPhy_NextGen_SchoolSum%>%
  select(`MF%`, `EN%`, `WA%`, `School Code`, `MD Diff SD`, `School Size`)%>%
  pivot_longer(c(1:3), names_to = "Report Cat", values_to = "Report Cat%")
 
 HSPhy_Cat<-HSPhy_Cat%>%
   mutate(Report_Cat_Weight = case_when(
     `Report Cat` == "MF%" ~ 50,
     `Report Cat` == "EN%" ~ 30,
     `Report Cat` == "WA%" ~ 20
   ))
 
 HSPhy_Cat
Code
   # HSPhy_Weight<- HSPhy_Weight%>%
   #   select(`School Name`, `School Code`, `Report Cat`, `Exam Weight`, `% Points`)%>%
   #  group_by(`School Name`, `School Code`, `Report Cat`)%>%
   # summarize(`Exam Weight` = mean(`Exam Weight`),
   #           `Report Cat %` = mean(`% Points`))%>%
   # ungroup()
   # 

fit_weight = lm(`Report Cat%` ~ `Report_Cat_Weight`, data = HSPhy_Cat)
 
summary(fit_weight)

Call:
lm(formula = `Report Cat%` ~ Report_Cat_Weight, data = HSPhy_Cat)

Residuals:
    Min      1Q  Median      3Q     Max 
-30.272  -9.496  -0.384   9.046  33.893 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)       40.33036    2.10210  19.186  < 2e-16 ***
Report_Cat_Weight  0.23884    0.05906   4.044 6.54e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 13.5 on 334 degrees of freedom
Multiple R-squared:  0.04667,   Adjusted R-squared:  0.04382 
F-statistic: 16.35 on 1 and 334 DF,  p-value: 6.537e-05
Code
AIC(fit_weight)
[1] 2706.685
Code
BIC(fit_weight)
[1] 2718.136

Fit 2: Report_Cat_Weight, MD Diff SD

When we control for the variation in MD Diff SD, Report_Cat_Weight: \(p = 1.05e-06\) and MD Diff SD: \(p < 2e-16\) are both statistically significant. For our model we have:

  • \(p-value: 2.2e-16\)
  • adjusted \(r^2 = 0.37\)
  • AIC = 2568
  • BIC = 2584
Code
fit_weight_md = lm(`Report Cat%` ~ `Report_Cat_Weight` + `MD Diff SD`, data = HSPhy_Cat)
 
summary(fit_weight_md)

Call:
lm(formula = `Report Cat%` ~ Report_Cat_Weight + `MD Diff SD`, 
    data = HSPhy_Cat)

Residuals:
    Min      1Q  Median      3Q     Max 
-18.480  -8.721  -2.223   6.604  37.986 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)       59.63911    2.25386  26.461  < 2e-16 ***
Report_Cat_Weight  0.23884    0.04801   4.975 1.05e-06 ***
`MD Diff SD`      -2.24586    0.17097 -13.136  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.98 on 333 degrees of freedom
Multiple R-squared:  0.3721,    Adjusted R-squared:  0.3683 
F-statistic: 98.66 on 2 and 333 DF,  p-value: < 2.2e-16
Code
AIC(fit_weight_md)
[1] 2568.397
Code
BIC(fit_weight_md)
[1] 2583.665

Fit 3: Report_Cat_Weight, MD Diff SD, School Size

When we additionally control for School Size and its interaction with MD Diff SD, we have that Report_Cat_Weight: \(p = 4.37e-07\), MD Diff SD: \(p = 4.10e-12\), and School Size: \(p = 4.33e-06\) are statistically significant and there is a statistically significant, \(p = 4.74e-07\), interaction effect between MD Diff SD and School Size. For our model we have:

  • \(p-value: 2.2e-16\)
  • adjusted \(r^2 = 0.41\)
  • AIC = 2546
  • BIC = 2569
Code
fit_weight_md_size = lm(`Report Cat%` ~ `Report_Cat_Weight` + `MD Diff SD` + `School Size` + `School Size`*`MD Diff SD`, data = HSPhy_Cat)
 
summary(fit_weight_md_size)

Call:
lm(formula = `Report Cat%` ~ Report_Cat_Weight + `MD Diff SD` + 
    `School Size` + `School Size` * `MD Diff SD`, data = HSPhy_Cat)

Residuals:
    Min      1Q  Median      3Q     Max 
-19.175  -7.607  -1.923   6.314  36.632 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                55.12005    3.32785  16.563  < 2e-16 ***
Report_Cat_Weight           0.23884    0.04633   5.155 4.37e-07 ***
`MD Diff SD`               -1.75112    0.24325  -7.199 4.10e-12 ***
`School Size`               6.83058    1.46172   4.673 4.33e-06 ***
`MD Diff SD`:`School Size` -0.98064    0.19083  -5.139 4.74e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.59 on 331 degrees of freedom
Multiple R-squared:  0.4187,    Adjusted R-squared:  0.4117 
F-statistic: 59.61 on 4 and 331 DF,  p-value: < 2.2e-16
Code
AIC(fit_weight_md_size)
[1] 2546.464
Code
BIC(fit_weight_md_size)
[1] 2569.367

Since this model has the lowest AIC, the 2nd lowest BIC, and the highest adjusted \(r^2\), and all terms are statistically significant, we can conclude:

  • A school’s summary performance on items in a given content Reporting Category, as measured by Report Cat%, is positively associated with the Reporting Category’s weight within the exam. Report_Cat_Weight was statistically significant even when we control for the impact of MD Diff SD and School Size.

Diagnostic of Fit 3: fit_weight_md_size

When we consider our diagnostics, our model seems to satisfy the assumptions of:

  • Linearity
  • Normality of Errors
  • Homoskedasticity: there is a slight upward trend in our Scale-Location plot, but it is not very steep.

Our model does have a few observations that violate the Cook’s distance rule of thumb. We have 336 observations, so the threshold is \(\frac{4}{n} \approx .01\), and the 141st, 310th, and 311th observations exceed it.

When we remove those observations and fit a new model, we find that all of our explanatory variables continue to be statistically significant.

For our adjusted model we have:

  • \(p-value: 2.2e-16\)
  • adjusted \(r^2 = .49\)
  • AIC = 2437
  • BIC = 2460

When we remove the outliers, the direction of our model and significance of our explanatory variables does not change.

Code
HSPhy_Cat
Code
4/336
[1] 0.01190476
Code
plot(fit_weight_md_size, which = 1:6)

Code
 test<-HSPhy_Cat%>%
   filter(`School Code` != "01520505")%>%
    filter(`School Code` != "08010605")


 fit_weight_md_size_adj = lm(`Report Cat%` ~ `Report_Cat_Weight` + `MD Diff SD` + `School Size` + `School Size`*`MD Diff SD`, data = test)
summary(fit_weight_md_size_adj)

Call:
lm(formula = `Report Cat%` ~ Report_Cat_Weight + `MD Diff SD` + 
    `School Size` + `School Size` * `MD Diff SD`, data = test)

Residuals:
    Min      1Q  Median      3Q     Max 
-18.819  -6.956  -1.755   6.434  25.277 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                53.71342    3.04123  17.662  < 2e-16 ***
Report_Cat_Weight           0.24110    0.04243   5.683 2.95e-08 ***
`MD Diff SD`               -1.80114    0.22205  -8.111 1.04e-14 ***
`School Size`               6.84980    1.32939   5.153 4.47e-07 ***
`MD Diff SD`:`School Size` -0.87165    0.17376  -5.016 8.68e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 9.613 on 325 degrees of freedom
Multiple R-squared:  0.4926,    Adjusted R-squared:  0.4863 
F-statistic: 78.87 on 4 and 325 DF,  p-value: < 2.2e-16
Code
AIC(fit_weight_md_size_adj)
[1] 2437.103
Code
BIC(fit_weight_md_size_adj)
[1] 2459.898

Visualization of Fit 3

Code
ggplot(data = HSPhy_Cat, aes(x = `MD Diff SD`, y = `Report Cat%`, color = `Report_Cat_Weight`)) +
  geom_point() +
  geom_smooth(method="lm", se=T)+
   labs( y = "Report Cat%",
            title = "Variance in Mathematics and Category Weight vs. Achievement",
            x = "MD Diff SD",
            caption = "NextGen HS Physics MCAS")

Conclusion

Mathematics, Mathematics, Mathematics: across both hypotheses, a school’s variability on Mathematics and Data items, MD Diff SD, was the most consistent predictor of achievement on the Introductory Physics MCAS, even after controlling for School Size and for a Reporting Category’s weight within the exam.

References

Chief State School Officers, Council of. 2021. “How States Can Support the Adoption & Effective Use of High Quality Standards-Aligned Instructional Materials.” https://753a0706.flowpaper.com/CCSSOIMPDCaseStudyOVERVIEWFINAL/#page=1.
Chingos, M. M., and G. J. Whitehurst. 2012. “Choosing Blindly: Instructional Materials, Teacher Effectiveness, and the Common Core.” https://www.brookings.edu/wp-content/uploads/2016/06/0410_curriculum_chingos_whitehurst.pdf.
DESE. 2022. “High School Introductory Physics Item Report.” https://profiles.doe.mass.edu/mcas/mcasitems2.aspx?grade=HS&subjectcode=PHY&linkid=23&orgcode=04830000&fycode=2022&orgtypecode=5&.
Greer, W. 2018. “The 50 Year History of the Common Core.” The Journal of Educational Foundations 31 (3&4): 100–117. https://files.eric.ed.gov/fulltext/EJ1212104.pdf.
Papy, Mantil, J. P. 2020. “Lifting All Boats? Accomplishments and Challenges from 20 Years of Education Reform in Massachusetts.” https://annenberg.brown.edu/sites/default/files/LiftingAllBoats_FINAL.pdf.